Data summary
| | |
|---|---|
| Name | surface_max_full |
| Number of rows | 145692 |
| Number of columns | 26 |
| Column type frequency: | |
| character | 20 |
| numeric | 6 |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| geoid | 1296 | 0.99 | 5 | 5 | 0 | 70 | 0 |
| lab_sample_id | 108 | 1.00 | 6 | 12 | 0 | 1699 | 0 |
| site_code | 0 | 1.00 | 9 | 13 | 0 | 1349 | 0 |
| coc_sample_id | 4780 | 0.97 | 1 | 26 | 0 | 1478 | 0 |
| sample_type | 137253 | 0.06 | 6 | 16 | 0 | 5 | 0 |
| lab_name | 0 | 1.00 | 4 | 25 | 0 | 17 | 0 |
| lab_job_name | 12663 | 0.91 | 7 | 12 | 0 | 162 | 0 |
| collection_date | 0 | 1.00 | 22 | 22 | 0 | 193 | 0 |
| analysis_method | 108 | 1.00 | 4 | 29 | 0 | 10 | 0 |
| analysis_date | 20260 | 0.86 | 22 | 22 | 0 | 211 | 0 |
| watershed | 0 | 1.00 | 4 | 21 | 0 | 52 | 0 |
| waterbody | 108 | 1.00 | 5 | 31 | 0 | 583 | 0 |
| location_code | 0 | 1.00 | 6 | 10 | 0 | 1205 | 0 |
| huc10 | 17064 | 0.88 | 9 | 10 | 0 | 153 | 0 |
| huc8 | 0 | 1.00 | 8 | 8 | 0 | 55 | 0 |
| project | 0 | 1.00 | 4 | 13 | 0 | 2 | 0 |
| description | 216 | 1.00 | 2 | 53 | 0 | 1237 | 0 |
| additional_description | 96984 | 0.33 | 5 | 69 | 0 | 407 | 0 |
| sample_depth | 144901 | 0.01 | 6 | 10 | 0 | 3 | 0 |
| analyte | 0 | 1.00 | 14 | 25 | 0 | 108 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| longitude | 0 | 1.00 | -84.70 | 1.61 | -90.18 | -85.65 | -84.36 | -83.43 | -82.43 | ▁▂▃▅▇ |
| latitude | 0 | 1.00 | 43.43 | 1.44 | 41.76 | 42.46 | 42.99 | 43.68 | 47.47 | ▇▆▁▁▂ |
| dilution_factor | 14067 | 0.90 | 1.26 | 3.34 | 0.00 | 1.00 | 1.00 | 1.00 | 100.00 | ▇▁▁▁▁ |
| duplicate | 0 | 1.00 | 1.06 | 0.26 | 1.00 | 1.00 | 1.00 | 1.00 | 3.00 | ▇▁▁▁▁ |
| visit_id | 5616 | 0.96 | 2020146.24 | 1518.66 | 2013001.00 | 2019005.00 | 2020211.00 | 2021215.00 | 2022021.00 | ▁▁▂▇▇ |
| analyte_value | 0 | 1.00 | 2.70 | 55.14 | 0.00 | 0.23 | 1.03 | 2.00 | 11000.00 | ▇▁▁▁▁ |
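
The `complete_rate` values in the skim output for `surface_max_full` are the share of non-missing rows, so they can be reproduced from `n_missing` and the reported row count. The following is a minimal sketch in base R using four of the character columns; the numbers are copied from the summary, and the vector names are only illustrative.

```r
# Row count reported in the data summary for surface_max_full
n_rows <- 145692

# n_missing values copied from the character-variable summary
n_missing <- c(
  geoid = 1296,
  sample_type = 137253,
  additional_description = 96984,
  sample_depth = 144901
)

# complete_rate = 1 - (missing rows / total rows), rounded as in the printout
complete_rate <- round(1 - n_missing / n_rows, 2)
print(complete_rate)
```

Read this way, `sample_type` and `sample_depth` are almost entirely missing, while `geoid` is missing in under 1% of rows.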
Data summary
| | |
|---|---|
| Name | public_max_full |
| Number of rows | 87612 |
| Number of columns | 9 |
| Column type frequency: | |
| character | 5 |
| factor | 1 |
| numeric | 3 |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| geoid | 2437 | 0.97 | 5 | 5 | 0 | 86 | 0 |
| system_name | 28 | 1.00 | 3 | 59 | 0 | 3128 | 0 |
| sample_date | 0 | 1.00 | 22 | 22 | 0 | 1107 | 0 |
| lab_name_code | 0 | 1.00 | 3 | 40 | 0 | 32 | 0 |
| analyte | 0 | 1.00 | 4 | 11 | 0 | 28 | 0 |
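Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| system_type | 0 | 1 | FALSE | 12 | Com: 55054, Sch: 12662, Non: 6361, Non: 4642 |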
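Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| longitude | 0 | 1 | -84.82 | 1.20 | -90.17 | -85.62 | -84.76 | -83.78 | -82.44 | ▁▁▅▇▆ |
| latitude | 0 | 1 | 43.36 | 1.21 | 41.73 | 42.51 | 42.92 | 44.03 | 47.47 | ▇▅▂▁▁ |
| analyte_value | 0 | 1 | 0.14 | 5.28 | 0.00 | 0.00 | 0.00 | 0.00 | 860.00 | ▇▁▁▁▁ |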
Data summary
| | |
|---|---|
| Name | pfas_sites |
| Number of rows | 300 |
| Number of columns | 20 |
| Column type frequency: | |
| character | 16 |
| numeric | 4 |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| facility | 0 | 1.00 | 9 | 103 | 0 | 300 | 0 |
| county | 0 | 1.00 | 10 | 26 | 0 | 76 | 0 |
| address | 1 | 1.00 | 4 | 88 | 0 | 299 | 0 |
| city | 1 | 1.00 | 4 | 29 | 0 | 174 | 0 |
| type | 0 | 1.00 | 7 | 22 | 0 | 14 | 0 |
| residential_wells_sampled | 0 | 1.00 | 2 | 3 | 0 | 4 | 0 |
| site_lead | 0 | 1.00 | 9 | 65 | 0 | 107 | 0 |
| site_lead_email | 0 | 1.00 | 15 | 27 | 0 | 95 | 0 |
| site_lead_phone | 0 | 1.00 | 12 | 16 | 0 | 105 | 0 |
| hyperlink | 0 | 1.00 | 84 | 134 | 0 | 293 | 0 |
| location | 115 | 0.62 | 4 | 25 | 0 | 128 | 0 |
| military | 0 | 1.00 | 2 | 4 | 0 | 3 | 0 |
| facility_date | 8 | 0.97 | 22 | 22 | 0 | 190 | 0 |
| site_background | 265 | 0.12 | 626 | 3752 | 0 | 35 | 0 |
| drinking_water_information | 265 | 0.12 | 95 | 2098 | 0 | 35 | 0 |
| anticipated_activities | 265 | 0.12 | 15 | 397 | 0 | 35 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| geoid | 0 | 1 | 26088.58 | 45.84 | 26001.00 | 26049.00 | 26081.00 | 26125.00 | 26163.00 | ▃▅▇▆▆ |
| longitude | 0 | 1 | -84.82 | 1.32 | -90.13 | -85.70 | -84.76 | -83.66 | -82.44 | ▁▁▇▅▇ |
| latitude | 0 | 1 | 43.24 | 1.12 | 41.78 | 42.48 | 42.95 | 43.54 | 47.46 | ▇▆▂▁▁ |
| zip_code | 0 | 1 | 49003.64 | 3935.16 | 20020.00 | 48453.50 | 48911.00 | 49442.25 | 94105.00 | ▁▇▁▁▁ |
Exploring Missingness in Data Sets and Exploratory Data Analysis
Exploratory Data Analysis
Analytes analysis in both data sets
[1] "ADONA" "FOSA" "FTSA42" "FTSA62" "FTSA82"
[6] "HFPODA" "NEtFOSAA" "NMeFOSAA" "PF3ONS9Cl" "PF3OUdS11Cl"
[11] "PFBA" "PFBS" "PFDA" "PFDS" "PFDoDA"
[16] "PFHpA" "PFHpS" "PFHxA" "PFHxS" "PFNA"
[21] "PFNS" "PFOA" "PFOS" "PFPeA" "PFPeS"
[26] "PFTeDA" "PFTrDA" "PFUnDA"
[1] "CAS13252136_GenX" "CAS13252136_GenXMdl"
[3] "CAS13252136_GenXRl" "CAS16517116_PFODA"
[5] "CAS16517116_PFODAMdl" "CAS16517116_PFODARl"
[7] "CAS1763231_PFOS" "CAS1763231_PFOSMdl"
[9] "CAS1763231_PFOSRl" "CAS2058948_PFUnA"
[11] "CAS2058948_PFUnAMdl" "CAS2058948_PFUnARl"
[13] "CAS2355319_NMeFOSAA" "CAS2355319_NMeFOSAAMdl"
[15] "CAS2355319_NMeFOSAARl" "CAS2706903_PFPeA"
[17] "CAS2706903_PFPeAMdl" "CAS2706903_PFPeARl"
[19] "CAS2706914_PFPeS" "CAS2706914_PFPeSMdl"
[21] "CAS2706914_PFPeSRl" "CAS27619972_62FTS"
[23] "CAS27619972_62FTSMdl" "CAS27619972_62FTSRl"
[25] "CAS2991506_NEtFOSAA" "CAS2991506_NEtFOSAAMdl"
[27] "CAS2991506_NEtFOSAARl" "CAS30334691_PFBSA"
[29] "CAS30334691_PFBSAMdl" "CAS30334691_PFBSARl"
[31] "CAS307244_PFHxA" "CAS307244_PFHxAMdl"
[33] "CAS307244_PFHxARl" "CAS307551_PFDoA"
[35] "CAS307551_PFDoAMdl" "CAS307551_PFDoARl"
[37] "CAS335671_PFOA" "CAS335671_PFOAMdl"
[39] "CAS335671_PFOARl" "CAS335762_PFDA"
[41] "CAS335762_PFDAMdl" "CAS335762_PFDARl"
[43] "CAS335773_PFDS" "CAS335773_PFDSMdl"
[45] "CAS335773_PFDSRl" "CAS355464_PFHxS"
[47] "CAS355464_PFHxSMdl" "CAS355464_PFHxSRl"
[49] "CAS356025_33FTCA" "CAS356025_33FTCAMdl"
[51] "CAS356025_33FTCARl" "CAS375224_PFBA"
[53] "CAS375224_PFBAMdl" "CAS375224_PFBARl"
[55] "CAS375735_PFBS" "CAS375735_PFBSMdl"
[57] "CAS375735_PFBSRl" "CAS375859_PFHpA"
[59] "CAS375859_PFHpAMdl" "CAS375859_PFHpARl"
[61] "CAS375928_PFHpS" "CAS375928_PFHpSMdl"
[63] "CAS375928_PFHpSRl" "CAS375951_PFNA"
[65] "CAS375951_PFNAMdl" "CAS375951_PFNARl"
[67] "CAS376067_PFTeA" "CAS376067_PFTeAMdl"
[69] "CAS376067_PFTeARl" "CAS39108344_82FTS"
[71] "CAS39108344_82FTSMdl" "CAS39108344_82FTSRl"
[73] "CAS41997131_PFHxSA" "CAS41997131_PFHxSAMdl"
[75] "CAS41997131_PFHxSARl" "CAS646833_PFecHS"
[77] "CAS646833_PFecHSMdl" "CAS646833_PFecHSRl"
[79] "CAS67905195_PFHxDA" "CAS67905195_PFHxDAMdl"
[81] "CAS67905195_PFHxDARl" "CAS68259121_PFNS"
[83] "CAS68259121_PFNSMdl" "CAS68259121_PFNSRl"
[85] "CAS72629948_PFTriA" "CAS72629948_PFTriAMdl"
[87] "CAS72629948_PFTriARl" "CAS754916_PFOSA"
[89] "CAS754916_PFOSAMdl" "CAS754916_PFOSARl"
[91] "CAS756426581_F53BMajor" "CAS756426581_F53BMajorMdl"
[93] "CAS756426581_F53BMajorRl" "CAS757124724_42FTS"
[95] "CAS757124724_42FTSMdl" "CAS757124724_42FTSRl"
[97] "CAS763051929_F53BMinor" "CAS763051929_F53BMinorMdl"
[99] "CAS763051929_F53BMinorRl" "CAS812704_73FTCA"
[101] "CAS812704_73FTCAMdl" "CAS812704_73FTCARl"
[103] "CAS914637493_53FTCA" "CAS914637493_53FTCAMdl"
[105] "CAS914637493_53FTCARl" "CAS919005144_ADONA"
[107] "CAS919005144_ADONAMdl" "CAS919005144_ADONARl"
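
The two listings above use different naming schemes: the public water analytes are plain abbreviations, while the surface water names have the form `CAS<digits>_<label>` with an optional `Mdl` or `Rl` suffix. The sketch below, in base R, shows one way the surface names could be split into parts and compared against the public names. The helper `parse_surface_name()` is hypothetical and not part of the original analysis; the reading of `Mdl` and `Rl` as detection-limit and reporting-limit fields is an assumption the source does not confirm, and the code assumes no analyte label itself ends in `Mdl` or `Rl`.

```r
# Public water analyte names, copied from the first listing above.
public_analytes <- c(
  "ADONA", "FOSA", "FTSA42", "FTSA62", "FTSA82", "HFPODA", "NEtFOSAA",
  "NMeFOSAA", "PF3ONS9Cl", "PF3OUdS11Cl", "PFBA", "PFBS", "PFDA", "PFDS",
  "PFDoDA", "PFHpA", "PFHpS", "PFHxA", "PFHxS", "PFNA", "PFNS", "PFOA",
  "PFOS", "PFPeA", "PFPeS", "PFTeDA", "PFTrDA", "PFUnDA"
)

# A handful of surface water names, copied from the second listing above.
surface_names <- c(
  "CAS13252136_GenX", "CAS13252136_GenXMdl", "CAS13252136_GenXRl",
  "CAS1763231_PFOS", "CAS1763231_PFOSMdl", "CAS1763231_PFOSRl",
  "CAS335671_PFOA", "CAS2058948_PFUnA", "CAS27619972_62FTS"
)

# Hypothetical helper: split each name into CAS digits, analyte label,
# and field type ("value" when there is no Mdl/Rl suffix).
parse_surface_name <- function(x) {
  label <- sub("^CAS[0-9]+_", "", x)
  data.frame(
    raw = x,
    cas_digits = sub("^CAS([0-9]+)_.*$", "\\1", x),
    analyte = sub("(Mdl|Rl)$", "", label),
    field = ifelse(grepl("Mdl$", label), "Mdl",
                   ifelse(grepl("Rl$", label), "Rl", "value")),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_surface_name(surface_names)

# Compare the surface labels with the public water names by exact match.
surface_analytes <- unique(parsed$analyte[parsed$field == "value"])
shared <- intersect(surface_analytes, public_analytes)
surface_only <- setdiff(surface_analytes, public_analytes)

print(parsed)
print(shared)
print(surface_only)
```

Exact matching only catches labels spelled identically in both data sets; labels such as `62FTS` and `PFUnA` have no identical counterpart in the public list, so a manual lookup table would be needed before combining the two sources by analyte.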
Surface Water Visualizations
Reading layer `cb_2018_us_county_500k' from data source
`/Users/nancyodhiambo/Library/CloudStorage/OneDrive-GrandValleyStateUniversity/Projects/Michigan_PFAS/Data cleaning, EDA & dashboard/pfas-dashboard/cb_2018_us_county_500k.shp'
using driver `ESRI Shapefile'
Simple feature collection with 3233 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -179.1489 ymin: -14.5487 xmax: 179.7785 ymax: 71.36516
Geodetic CRS: NAD83
Public Water Analysis
[1] "geoid" "longitude" "latitude" "system_name"
[5] "system_type" "sample_date" "lab_name_code" "analyte"
[9] "analyte_value"
PFAS sites Analysis
Combined Exploratory Data Analysis
| NAME | total_surface | total_public | n_sites |
|---|---|---|---|
| St. Clair | 57,926.781 | 9.0 | 3 |
| Oakland | 52,017.184 | 1,638.0 | 23 |
| Kalamazoo | 50,842.450 | 2,173.8 | 14 |
| Wayne | 32,266.011 | 58.0 | 18 |
| Kent | 15,589.815 | 525.2 | 27 |
| Washtenaw | 15,261.780 | 141.0 | 9 |
| Livingston | 10,870.080 | 346.0 | 12 |
| Genesee | 8,795.432 | 11.0 | 17 |
| Muskegon | 8,284.114 | 330.7 | 17 |
| Marquette | 8,101.250 | 366.0 | 2 |
In-Depth Analysis of the Public Water Data Set in Connection with the PFAS Sites, Regulations, and Pollution Sites
# A tibble: 6 × 36
geoid longitude latitude system_name system_type sample_date lab_name_code
<chr> <dbl> <dbl> <chr> <fct> <chr> <chr>
1 26025 -84.9 42.3 115 Truck Stop Community … 2021/02/02… Lansing Lab
2 26139 -85.8 43.0 1239 Comstock … Non-Commun… 2019/05/17… Val
3 26005 -85.7 42.5 12th St-Martin Community … 2021/01/28… Lansing Lab
4 26125 -83.5 42.8 15 Oak Square … Community … 2020/10/22… Prein-Newhof
5 26117 -84.8 43.2 1st Baptist Ch… School 2018/07/03… Val
6 26055 -85.6 44.7 42nd Street Pl… Community … 2020/12/10… Als-Holland
# ℹ 29 more variables: ADONA <dbl>, FOSA <dbl>, FTSA42 <dbl>, FTSA62 <dbl>,
# FTSA82 <dbl>, HFPODA <dbl>, NEtFOSAA <dbl>, NMeFOSAA <dbl>,
# PF3ONS9Cl <dbl>, PF3OUdS11Cl <dbl>, PFBA <dbl>, PFBS <dbl>, PFDA <dbl>,
# PFDS <dbl>, PFDoDA <dbl>, PFHpA <dbl>, PFHpS <dbl>, PFHxA <dbl>,
# PFHxS <dbl>, PFNA <dbl>, PFNS <dbl>, PFOA <dbl>, PFOS <dbl>, PFPeA <dbl>,
# PFPeS <dbl>, PFTeDA <dbl>, PFTrDA <dbl>, PFUnDA <dbl>, Hazard_Index <dbl>
| System Type | Total Hazard Index | Number of Sites |
|---|---|---|
| Community Water Supply (for example Municipal Supply, Apartment, Nursing Home, Prison, etc.) | 165.4168 | 2,444 |
| School | 23.9905 | 498 |
| Non-Community Water Supply (Industry) | 4.5405 | 239 |
| Non-Community Water Supply (Children's Camp) | 2.6135 | 139 |
| Non-Community Water Supply (Child Care Provider) | 2.4250 | 178 |
| Office Building | 2.3040 | 26 |
| Non-Community Water Supply (Medical Care Provider) | 0.7160 | 130 |
| Non-Community Water Supply (Adult Foster Care Provider) | 0.0000 | 7 |
| Non-Community Water Supply (Hotel or Motel) | 0.0000 | 5 |
| Park | 0.0000 | 5 |
| Residential | 0.0000 | 1 |
| Tribal Lands | 0.0000 | 17 |
Viewing the Trajectory of PFAS Sites with Repeated Visits and PFAS Entries Using the public_water_long Data Set